[SPARK-27754][K8S] Introduce additional config (spark.kubernetes.driver.request.cores) for driver request cores for spark on k8s #24630
Conversation
Spark on k8s supports a config for specifying the executor CPU requests (spark.kubernetes.executor.request.cores), but a similar config is missing for the driver. Apparently `spark.driver.cores` works, but it's not evident that this accepts fractional values (it's defined as an integer config but apparently accepts decimals). To keep in sync with the executor config, a similar driver config can be introduced (spark.kubernetes.driver.request.cores) for explicitly specifying the driver CPU requests. If not provided, the value will default to `spark.driver.cores` as before.
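As a minimal sketch of the proposed fallback order, the snippet below models the resolution with a plain `Map` standing in for Spark's `SparkConf` (the map-based lookup and object names here are illustrative assumptions, not Spark's actual API):

```scala
// Simplified model of the proposed resolution order: use
// spark.kubernetes.driver.request.cores when it is set, otherwise
// fall back to spark.driver.cores (whose default is 1).
object DriverCoresRequest {
  def resolve(conf: Map[String, String]): String =
    conf.getOrElse("spark.kubernetes.driver.request.cores",
      conf.getOrElse("spark.driver.cores", "1"))
}

object Demo {
  def main(args: Array[String]): Unit = {
    // An explicit k8s-style fractional request wins over spark.driver.cores.
    println(DriverCoresRequest.resolve(
      Map("spark.kubernetes.driver.request.cores" -> "100m",
          "spark.driver.cores" -> "2")))  // prints 100m
    // Without the new config, the existing behavior is unchanged.
    println(DriverCoresRequest.resolve(Map("spark.driver.cores" -> "2")))  // prints 2
    println(DriverCoresRequest.resolve(Map.empty))                         // prints 1
  }
}
```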
Test build #105472 has finished for PR 24630 at commit
LGTM
@mccheah ?
  conf.get(KUBERNETES_DRIVER_REQUEST_CORES).get
} else {
  driverCpuCores
}
Thank you for making a PR, @arunmahadevan . Could you rewrite like the following one-liner?
- private val driverCoresRequest = if (conf.contains(KUBERNETES_DRIVER_REQUEST_CORES)) {
- conf.get(KUBERNETES_DRIVER_REQUEST_CORES).get
- } else {
- driverCpuCores
- }
+ private val driverCoresRequest = conf.get(KUBERNETES_DRIVER_REQUEST_CORES.key, driverCpuCores)
Thanks for the suggestion. I just followed the pattern used somewhere else in the code.
  .configurePod(basePod)
  .container.getResources
  .getRequests.asScala
assert(requests1("cpu").getAmount === "1")
Can we avoid this assumption? You had better get the default value of DRIVER_CORES and compare with that.
IIUC, as per the existing logic the default value would be "1" if `spark.driver.cores` is not set, not the default value of `spark.driver.cores` (which also happens to be 1). I did not want to change that logic.
@arunmahadevan . What I meant was this constant assertion, `=== "1"`. If we change the default value of that configuration, this test will fail. We had better get and use the default value from the conf.
@dongjoon-hyun , the logic in BasicDriverFeatureStep, `private val driverCpuCores = conf.get(DRIVER_CORES.key, "1")`, sets the value of `driverCpuCores` to "1" if `spark.driver.cores` is not set in the Spark conf, i.e. we don't use `DRIVER_CORES.defaultValue` there. Let me know if my understanding is correct.
If so, I cannot do something like `assert(requests1("cpu").getAmount === DRIVER_CORES.defaultValue)` here (assume this is what you are suggesting).
That should be the following because we should not have a magic number in the code.
- private val driverCpuCores = conf.get(DRIVER_CORES.key, "1")
+ private val driverCpuCores = conf.get(DRIVER_CORES)
val driverCpuQuantity = new QuantityBuilder(false)
- .withAmount(driverCpuCores)
+ .withAmount(driverCpuCores.toString)
Please fix like the above. And don't use magic numbers.
updated.
  .configurePod(basePod)
  .container.getResources
  .getRequests.asScala
assert(requests3("cpu").getAmount === "100m")
If you don't mind, could you avoid repetitions like the following?
Seq("0.1", "100m").foreach { value =>
  sparkConf.set(KUBERNETES_DRIVER_REQUEST_CORES, value)
  val requests = new BasicDriverFeatureStep(KubernetesTestConf.createDriverConf(sparkConf))
    .configurePod(basePod)
    .container.getResources
    .getRequests.asScala
  assert(requests("cpu").getAmount === value)
}
sure.
BTW, @arunmahadevan . The PR title,
Although I don't know the history of the decision, I'll take a look again after revision. Thanks.
It is worth noting that pod templates support overriding the Spark container CPU requests: https://github.com/apache/spark/blob/master/docs/running-on-kubernetes.md#pod-template
If you pass a pod spec template with a single container in it with resource requests set, Spark will use that as a base in building the pod resources.
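For illustration, a minimal driver pod template along those lines might look like the sketch below (the file would be passed via `spark.kubernetes.driver.podTemplateFile`; the container name and values here are illustrative assumptions):

```yaml
# Hypothetical driver pod template; Spark takes the container's resource
# requests as the base when it builds the driver pod.
spec:
  containers:
    - name: spark-driver          # illustrative container name
      resources:
        requests:
          cpu: "250m"             # fractional CPU request
```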
Ah, right. As @onursatici pointed out,
Since there is a more general way, the benefit of this PR is very limited. Please update the PR description, @arunmahadevan . Or, could you enhance the K8s documentation about this instead of adding this 3rd way of
Updated title to
I agree we have an alternate way of specifying the driver CPU request cores for Spark on k8s via the pod template. However, we already have
Test build #105500 has finished for PR 24630 at commit
I didn't suggest removing, @arunmahadevan . That kind of PR will be considered negatively too in the same way. You know, we usually are conservative for both sides (adding and removing).
In this case I think we will be more consistent by introducing a config specific to the k8s driver, for which the executor config already exists.
Test build #105509 has finished for PR 24630 at commit
Thank you for updates, @arunmahadevan . I updated the PR description with #24630 (comment).
I'll wait for the other reviewers' comments.
Merged to master. Thank you, @arunmahadevan and @felixcheung .
Thank you @dongjoon-hyun for taking the time to review and merge.
Very inflexible, I'm looking for some config option to do such maybe. Unfortunately, it doesn't work.
This should be done through mapping, not one by one.
The code is not elegant enough to support "unknown" k8s configuration items (why not use rule mapping instead of enumerating them one by one?).
…er.request.cores) for driver request cores for spark on k8s Spark on k8s supports config for specifying the executor cpu requests (spark.kubernetes.executor.request.cores) but a similar config is missing for the driver. Instead, currently `spark.driver.cores` value is used for integer value. Although `pod spec` can have `cpu` for the fine-grained control like the following, this PR proposes additional configuration `spark.kubernetes.driver.request.cores` for driver request cores. ``` resources: requests: memory: "64Mi" cpu: "250m" ``` Unit tests Closes apache#24630 from arunmahadevan/SPARK-27754. Authored-by: Arun Mahadevan <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
Spark on k8s supports a config for specifying the executor CPU requests (`spark.kubernetes.executor.request.cores`), but a similar config is missing for the driver. Instead, currently the `spark.driver.cores` value is used as an integer value. Although the pod spec can have `cpu` for fine-grained control, this PR proposes an additional configuration, `spark.kubernetes.driver.request.cores`, for the driver request cores.
How was this patch tested?
Unit tests